Using a priori information for speaker diarization

Authors

  • Daniel Moraru
  • Laurent Besacier
  • Eric Castelli
Abstract

This paper presents an attempt to use supplementary information for audio data diarization. The approach is based on a priori information about the speakers involved in the dialogue. This information consists of the number of speakers involved in the conversation and training data available for one speaker or for all of the speakers involved. The experiments were mainly conducted on the 2003 Rich Transcription diarization corpus, using both the dry-run corpus and the evaluation corpus. The results show that knowing the exact number of speakers a priori does not seem to be very useful. On the other hand, using a priori speaker models for one or all of the speakers involved in the conversation may improve diarization performance when enough data is available to train reliable speaker models.

Similar resources

Speaker Diarization Using a priori Acoustic Information

Speaker diarization is usually performed in a blind manner without using a priori knowledge about the identity or acoustic characteristics of the participating speakers. In this paper we propose a novel framework for incorporating available a priori knowledge such as potential participating speakers, channels, background noise and gender, and integrating these knowledge sources into blind speak...

Step-by-step and integrated approaches in broadcast news speaker diarization

This paper summarizes the collaboration of the LIA and CLIPS laboratories on speaker diarization of broadcast news during the spring NIST Rich Transcription 2003 evaluation campaign (NIST-RT 03S). The speaker diarization task consists of segmenting a conversation into homogeneous segments which are then grouped into speaker classes. Two approaches are described and compared for speaker diarizat...

Using a GPU, Online Diarization = Offline Diarization

This article presents a low-latency, online speaker diarization system (“who is speaking now?”) based on the repeated execution of a GPU-optimized, highly efficient offline diarization system (“who spoke when”). The system fulfills all requirements of the diarization task, i.e., it does not require any a priori information about the input, including specific speaker models. In contrast to earli...

A hybrid approach to online speaker diarization

This article presents a low-latency speaker diarization system (“who is speaking now?”) based on a hybrid approach that combines a traditional offline speaker diarization system (“who spoke when?”) with an online speaker identification system. The system fulfills all requirements of the diarization task, i.e. it does not need any a-priori information about the input, including no specific speak...

Speaker diarization of spontaneous meeting room conversations

Speaker diarization is the task of identifying “who spoke when” in an audio stream containing multiple speakers. This is an unsupervised task as there is no a priori information about the speakers. Diagnostical studies on state-of-the-art diarization systems have isolated three main issues with the systems; overlapping speech, effects of background noise and speech/nonspeech detection errors on...

Publication date: 2004